A comparison of several methods for analyzing censored data.
Abstract
The purpose of this study was to compare the performance of several methods for statistically analyzing censored datasets [i.e. datasets that contain measurements that are less than the field limit-of-detection (LOD)] when estimating the 95th percentile and the mean of right-skewed occupational exposure data. The methods examined were several variations on the maximum likelihood estimation (MLE) and log-probit regression (LPR) methods, the common substitution methods, several non-parametric (NP) quantile methods for the 95th percentile and the NP Kaplan-Meier (KM) method. Each method was challenged with computer-generated censored datasets for a variety of plausible scenarios where the following factors were allowed to vary randomly within fairly wide ranges: the true geometric standard deviation, the censoring point or LOD and the sample size. This was repeated for both a single-laboratory scenario (i.e. single LOD) and a multiple-laboratory scenario (i.e. three LODs) as well as a single lognormal distribution scenario and a contaminated lognormal distribution scenario. Each method was used to estimate the 95th percentile and mean for the censored datasets (the NP quantile methods estimated only the 95th percentile). For each scenario, the method bias and overall imprecision (as indicated by the root mean square error or rMSE) were calculated for the 95th percentile and mean. No single method was unequivocally superior across all scenarios, although nearly all of the methods excelled in one or more scenarios. Overall, only the MLE- and LPR-based methods performed well across all scenarios, with the robust versions generally showing less bias than the standard versions when challenged with a contaminated lognormal distribution and multiple LODs. All of the MLE- and LPR-based methods were remarkably robust to departures from the lognormal assumption, nearly always having lower rMSE values than the NP methods for the exposure scenarios postulated. 
In general, the MLE methods tended to have smaller rMSE values than the LPR methods, particularly for the small sample size scenarios. The substitution methods tended to be strongly biased, but in some scenarios had the smaller rMSE values, especially for sample sizes <20. Surprisingly, the various NP methods were not as robust as expected, performing poorly in the contaminated distribution scenarios for both the 95th percentile and the mean. In conclusion, when using the rMSE rather than bias as the preferred comparison metric, the standard MLE method consistently outperformed the so-called robust variations of the MLE-based and LPR-based methods, as well as the various NP methods, for both the 95th percentile and the mean. When estimating the mean, the standard LPR method tended to outperform the robust LPR-based methods. Whenever bias is the main consideration, the robust MLE-based methods should be considered. The KM method, currently hailed by some as the preferred method for estimating the mean when the lognormal distribution assumption is questioned, did not perform well for either the 95th percentile or mean and is not recommended.
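The kind of comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes a single lognormal distribution and a single LOD, fits the censored lognormal likelihood with a simple EM iteration (one of several ways to reach the MLE), and uses LOD/2 substitution as the comparison method. The function names, the default parameter values, and the plug-in formula exp(mu + z95 * sigma) for the 95th percentile are all illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z95 = STD_NORMAL.inv_cdf(0.95)  # standard normal 95th percentile


def censored_lognormal_mle(values, censored, n_iter=500, tol=1e-9):
    """EM estimate of (mu, sigma) on the log scale for left-censored data.

    values[i] is the measurement, or the LOD when censored[i] is True.
    """
    logs = [math.log(v) for v in values]
    n = len(logs)
    # Crude starting point: treat each LOD as if it were a measurement.
    mu = sum(logs) / n
    sigma = max(math.sqrt(sum((x - mu) ** 2 for x in logs) / n), 1e-3)
    for _ in range(n_iter):
        s1 = s2 = 0.0  # expected sums of x and x^2
        for x, is_cens in zip(logs, censored):
            if is_cens:
                z = (x - mu) / sigma
                lam = STD_NORMAL.pdf(z) / max(STD_NORMAL.cdf(z), 1e-300)
                m = mu - sigma * lam                         # E[X | X < log LOD]
                v = sigma ** 2 * (1.0 - z * lam - lam ** 2)  # Var[X | X < log LOD]
                s1 += m
                s2 += v + m * m
            else:
                s1 += x
                s2 += x * x
        new_mu = s1 / n
        new_sigma = math.sqrt(max(s2 / n - new_mu ** 2, 1e-12))
        done = abs(new_mu - mu) < tol and abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if done:
            break
    return mu, sigma


def substitution_fit(values, censored, factor=0.5):
    """Replace each censored value by factor * LOD, then fit on the log scale."""
    logs = [math.log(v * factor if c else v) for v, c in zip(values, censored)]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / (n - 1))
    return mu, sigma


def simulate(n=30, gm=1.0, gsd=2.5, lod=0.6, reps=200, seed=1):
    """Return {method: (bias, rMSE)} for the estimated 95th percentile."""
    rng = random.Random(seed)
    mu0, s0 = math.log(gm), math.log(gsd)
    true_p95 = math.exp(mu0 + Z95 * s0)
    errors = {"mle": [], "sub": []}
    for _ in range(reps):
        raw = [rng.lognormvariate(mu0, s0) for _ in range(n)]
        censored = [x < lod for x in raw]
        values = [lod if c else x for x, c in zip(raw, censored)]
        fits = {
            "mle": censored_lognormal_mle(values, censored),
            "sub": substitution_fit(values, censored),
        }
        for name, (mu, sigma) in fits.items():
            errors[name].append(math.exp(mu + Z95 * sigma) - true_p95)
    return {
        name: (sum(e) / reps, math.sqrt(sum(d * d for d in e) / reps))
        for name, e in errors.items()
    }


if __name__ == "__main__":
    for method, (bias, rmse) in simulate().items():
        print(f"{method}: bias={bias:.3f}  rMSE={rmse:.3f}")
```

Because each censored observation carries its own LOD in `values`, the same EM routine would also accept a multiple-LOD dataset. Reproducing the study's other scenarios (contaminated lognormal data, randomly varied GSD and sample size, the LPR, NP and KM methods) would need additional code that is not sketched here.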
Related resources
Comparison of three Estimation Procedures for Weibull Distribution based on Progressive Type II Right Censored Data
In this paper, based on progressive type II right censored data, we consider the MLE and AMLE of the scale and shape parameters of the Weibull distribution. A new type of parameter estimation, named inverse estimation, which uses properties of order statistics, is also introduced for both the shape and scale parameters of the Weibull distribution. We use simulations and study the biases a...
Penalized Estimators in Cox Regression Model
The proportional hazards Cox regression models play a key role in analyzing censored survival data. We use penalized methods in high-dimensional scenarios to achieve more efficient models. This article reviews penalized Cox regression for some frequently used penalty functions. Analysis of a medical dataset, namely "mgus2", confirms that the penalized Cox regression performs better than the cox regressi...
Exploring the Method for Analyzing Interval Censored Data Using Imputation in Competing Risks Model
We consider the problem of analyzing interval censored data when comparing cumulative incidence functions by demographic variables in the presence of competing risks. In this paper, we explore two methods based on imputation: the EM-type method and multiple imputation. Basically, we impute the exact event time for interval censored data and take advantage of standard estimation methods for right ce...
Prediction of Times to Failure of Censored Units in Hybrid Censored Samples from Exponential Distribution
In this paper, we discuss different predictors of times to failure of units censored in a hybrid censored sample from the exponential distribution. Bayesian and non-Bayesian point predictors for the times to failure of units are obtained. Non-Bayesian prediction intervals are obtained based on pivotal and highest conditional density methods. Bayesian prediction intervals are also proposed. One real...
Bayesian Estimation of Reliability of the Electronic Components Using Censored Data from Weibull Distribution: Different Prior Distributions
The Weibull distribution has been widely used in survival and engineering reliability analysis. In life testing experiments, it is fairly common practice to terminate the experiment before all the items have failed, which means the data are censored. Thus, the main objective of this paper is to estimate the reliability function of the Weibull distribution with uncensored and censored data by using B...
Estimation of Parameters for an Extended Generalized Half Logistic Distribution Based on Complete and Censored Data
This paper considers an Extended Generalized Half Logistic distribution. We derive some properties of this distribution and then we discuss estimation of the distribution parameters by the methods of moments, maximum likelihood and the new method of minimum spacing distance estimator based on complete data. Also, maximum likelihood equations for estimating the parameters based on Type-I and Typ...
Journal:
- The Annals of occupational hygiene
Volume 51, Issue 7
Pages: -
Publication date: 2007